A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications

Related resources

A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications

We review the literature on approximate dynamic programming, with the goal of better understanding the theory behind practical algorithms for solving dynamic programs with continuous and vector-valued states and actions and complex information processes. We build on the literature that has addressed the well-known problem of multidimensional (and possibly continuous) states, and the extensive l...

Batch Policy Iteration Algorithms for Continuous Domains

This paper establishes the link between an adaptation of the policy iteration method for Markov decision processes with continuous state and action spaces and the policy gradient method when the differentiation of the mean value is directly done over the policy without parameterization. This approach allows deriving sound and practical batch Reinforcement Learning algorithms for continuous stat...

Convergence Analysis of Kernel-based On-policy Approximate Policy Iteration Algorithms for Markov Decision Processes with Continuous, Multidimensional States and Actions

Using kernel smoothing techniques, we propose three different online, on-policy approximate policy iteration algorithms which can be applied to infinite horizon problems with continuous and vector-valued states and actions. Using Monte Carlo sampling to estimate the value function around the post-decision state, we reduce the problem to a sequence of deterministic, nonlinear programming problem...

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervised learning problem, have been proposed recently. Finding good policies with such methods requires not only an appropriate classifier, but also reliable examples of best actions, covering the state space sufficiently. Up to this ti...

Existence and Approximate $l^{p}$ and Continuous Solution of Nonlinear Integral Equations of the Hammerstein and Volterra Types

Many phenomena in our world are essentially nonlinear and are described by nonlinear equations. The advent of high-performance digital computers has made linear problems easier to solve. In general, however, exact solutions of nonlinear problems are difficult to obtain. Numerical methods can generally handle the complex computations that nonlinear problems involve. However, supplying points on a curve and obtaining the complete curve, which is often costly and ...

Journal

Journal title: Journal of Control Theory and Applications

Year: 2011

ISSN: 1672-6340,1993-0623

DOI: 10.1007/s11768-011-0313-y